
    Ontology-driven information retrieval.

    Ontology-driven information retrieval deals with the use of entities specified in domain ontologies to enhance search and browsing. The entities or concepts of lightweight ontological resources are traditionally used to index resources in specialised domains. Indexing with concepts is often achieved manually, and reusing these concept annotations to enhance search remains a challenge. Other challenges range from the difficulty of merging multiple ontologies for use in retrieval to the problem of integrating concept-based search into existing search systems. We mainly encounter these challenges in enterprise search environments, which have not kept pace with Web search engines and mostly rely on full-text search systems. Full-text search systems are keyword-based and suffer from well-known vocabulary mismatch problems. Ontologies model domain knowledge and have the potential for use in understanding the unstructured content of documents. In this thesis, we investigate the challenges of using domain ontologies to enhance search in enterprise systems. Firstly, we investigate methods for annotating documents by identifying the best concepts that represent their contents. We explore ways to overcome the challenge of insufficient textual features in lightweight ontologies and introduce an unsupervised method for annotating documents based on generating concept descriptors from external resources. Specifically, we augment concepts with descriptive textual content by exploiting the taxonomic structure of an ontology to ensure that we generate useful descriptors. Secondly, the need often arises for cross-ontology reasoning when using multiple ontologies in ontology-driven search. Once again, we attempt to overcome the absence of rich features in lightweight ontologies by exploring the use of background knowledge for the alignment process. We propose novel ontology alignment techniques which integrate string metrics, semantic features, and term weights for discovering diverse correspondence types in supervised and unsupervised ontology alignment. Thirdly, we investigate different representational schemes for queries and documents and explore semantic ranking models using conceptual representations. Accordingly, we propose a semantic ranking model that incorporates knowledge of concept relatedness and a predictive model to apply semantic ranking only when it is deemed beneficial for retrieval. Finally, we conduct comprehensive evaluations of the proposed methods and discuss our findings.
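
    The semantic ranking idea in the final part of the thesis can be illustrated with a short sketch: blend a conventional keyword score with a concept-relatedness score computed over a taxonomy. The relatedness function, the mixing weight and the medical concept labels below are illustrative assumptions, not the thesis's actual model.

```python
from math import exp

def concept_relatedness(c1, c2, path_length):
    """Toy relatedness that decays with taxonomic distance between two concepts.
    `path_length` maps an (ordered) concept pair to a precomputed path length."""
    return exp(-path_length.get((c1, c2), 10))

def semantic_score(query_concepts, doc_concepts, keyword_score, path_length, alpha=0.5):
    """Blend a full-text (keyword) score with the best concept-to-concept relatedness."""
    if not query_concepts or not doc_concepts:
        return keyword_score
    relatedness = max(concept_relatedness(q, d, path_length)
                      for q in query_concepts for d in doc_concepts)
    return alpha * keyword_score + (1 - alpha) * relatedness

# A document annotated with a concept closely related to the query concept
# is rewarded even though the keyword score alone is low.
paths = {("myocardial_infarction", "heart_failure"): 2}
print(semantic_score({"myocardial_infarction"}, {"heart_failure"},
                     keyword_score=0.1, path_length=paths))
```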

    Introducing Clood CBR: a cloud based CBR framework.

    CBR applications have been deployed in a wide range of sectors, from pharmaceuticals, defence and aerospace to IoT, transportation, and even poetry and music generation. However, the majority of applications have been built using monolithic architectures, which impose size and complexity constraints. Such applications face barriers to adopting new technologies and remain prohibitively expensive in both time and cost, because changes in frameworks or languages affect the application directly. To address this challenge, we introduce a distributed and highly scalable generic CBR framework, Clood, which is based on a microservices architecture. A microservices architecture splits the application into a set of smaller, interconnected services that scale to meet varying demands. Experimental results show that our Clood implementation retrieves cases at a fairly consistent rate as the casebase grows by several orders of magnitude, and was over 3,700 times faster than a comparable monolithic CBR system when retrieving from half a million cases. Microservices are cloud-native architectures, and with the rapid increase in cloud-computing adoption, it is timely for the CBR community to have access to such a framework. Video Link: https://youtu.be/CkuehJPEQ
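
    As a rough illustration of the microservice style described above, the sketch below exposes a toy in-memory k-NN case retrieval function behind an HTTP endpoint using FastAPI. The endpoint name, case schema and similarity function are assumptions made for illustration, not Clood's actual implementation.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from typing import Dict

app = FastAPI()

# Toy in-memory casebase: each case has numeric features and a solution label.
CASEBASE = [
    {"features": {"temperature": 0.7, "load": 0.2}, "solution": "scale-out"},
    {"features": {"temperature": 0.1, "load": 0.9}, "solution": "rebalance"},
]

class Query(BaseModel):
    features: Dict[str, float]
    k: int = 1

def similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Global similarity as the mean of per-feature local similarities (1 - |difference|)."""
    keys = set(a) | set(b)
    return sum(1 - abs(a.get(key, 0.0) - b.get(key, 0.0)) for key in keys) / len(keys)

@app.post("/retrieve")
def retrieve(query: Query):
    """Return the k most similar cases to the query."""
    ranked = sorted(CASEBASE, reverse=True,
                    key=lambda case: similarity(query.features, case["features"]))
    return ranked[: query.k]
```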

    Ontology alignment based on word embedding and random forest classification.

    Ontology alignment is crucial for integrating heterogeneous data sources and forms an important component for realising the goals of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used for discovering correspondences between the concepts (or entities) of different ontologies. However, these techniques mostly depend on string-based similarities, which are unable to handle the vocabulary mismatch problem. Also, determining which similarity measures to use and how to effectively combine them in alignment systems are challenges that have persisted in this area. In this work, we introduce a random forest classifier approach for ontology alignment which relies on word embedding to discover semantic similarities between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that are used by the classifier model to determine when concepts match. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can deal with knowledge-light ontological resources. It also eliminates the need for learning the aggregation weights of multiple similarity measures. Our experiments using the Ontology Alignment Evaluation Initiative (OAEI) datasets and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems.
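
    The feature-vector idea can be sketched as follows: each candidate concept pair is described by a string similarity and an embedding-based similarity, and a random forest decides whether the pair is a match. The toy embeddings, labels and concept names below are stand-ins; the paper's actual features and word-embedding model may differ.

```python
from difflib import SequenceMatcher
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def string_sim(a: str, b: str) -> float:
    """Simple edit-based similarity between two concept labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical word embeddings for concept labels (in practice from e.g. word2vec).
emb = {"car": np.array([0.9, 0.1]), "automobile": np.array([0.85, 0.2]),
       "tree": np.array([0.1, 0.9])}

# Labelled concept pairs: 1 = equivalent, 0 = not equivalent.
pairs = [("car", "automobile", 1), ("car", "tree", 0), ("tree", "automobile", 0)]
X = np.array([[string_sim(a, b), cosine(emb[a], emb[b])] for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Predict whether an unseen pair is a match from its feature vector.
features = [[string_sim("car", "automobile"), cosine(emb["car"], emb["automobile"])]]
print(clf.predict(features))
```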

    Counterfactual explanations for student outcome prediction with Moodle footprints.

    Counterfactual explanations focus on “actionable knowledge” to help end-users understand how a machine learning outcome could be changed to one that is more desirable. For this purpose, a counterfactual explainer needs to be able to reason with similarity knowledge in order to discover input dependencies that relate to outcome changes. Identifying the minimum subset of feature changes needed to action a change in the decision is an interesting challenge for counterfactual explainers. In this paper we show how feature relevance-based explainers (such as LIME) can be combined with a counterfactual explainer to identify the minimum subset of “actionable features”. We demonstrate our hybrid approach on a real-world use case of student outcome prediction using data from the Campus Moodle Virtual Learning Environment. Our preliminary results demonstrate counterfactual feature weighting to be a viable strategy that should be adopted to minimise the number of actionable changes.
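
    A minimal sketch of the first step, obtaining a feature relevance ordering with LIME for one student's prediction, is shown below. The Moodle-style feature names and the toy pass/fail data are hypothetical, and the hand-off from relevance weights to the counterfactual search is simplified relative to the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
feature_names = ["logins", "forum_posts", "quiz_attempts", "resource_views"]  # hypothetical
X = rng.random((200, 4))
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)  # toy pass/fail rule

model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["fail", "pass"], mode="classification")
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=4)

# Features ordered by |weight|: the candidates a counterfactual search would change first.
for feature, weight in sorted(explanation.as_list(), key=lambda fw: abs(fw[1]), reverse=True):
    print(feature, round(weight, 3))
```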

    Actionable feature discovery in counterfactuals using feature relevance explainers.

    Counterfactual explanations focus on 'actionable knowledge' to help end-users understand how a machine learning model outcome could be changed to a more desirable one. For this purpose, a counterfactual explainer needs to be able to reason with similarity knowledge in order to discover input dependencies that relate to outcome changes. Identifying the minimum subset of feature changes needed to action a change in the decision is an interesting challenge for counterfactual explainers. In this paper we show how feature relevance-based explainers (e.g. LIME, SHAP) can inform a counterfactual explainer to identify the minimum subset of 'actionable features'. We demonstrate our DisCERN (Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods) algorithm on three datasets and compare it against the widely used counterfactual approach DiCE. Our preliminary results show DisCERN to be a viable strategy that should be adopted to minimise the actionable changes.
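
    Case-based counterfactual methods of this kind typically start from a nearest unlike neighbour (NUN): the closest known instance whose class differs from the query's. The sketch below shows that step in isolation, with a Euclidean distance and toy data chosen for illustration rather than taken from the paper.

```python
import numpy as np

def nearest_unlike_neighbour(x, X_train, y_train, x_class):
    """Return the closest (Euclidean) training instance with a different class to x."""
    unlike = X_train[y_train != x_class]
    distances = np.linalg.norm(unlike - x, axis=1)
    return unlike[np.argmin(distances)]

X_train = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0]])
y_train = np.array([0, 0, 1])
print(nearest_unlike_neighbour(np.array([1.5, 1.5]), X_train, y_train, x_class=0))
```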

    DisCERN: discovering counterfactual explanations using relevance features from neighbourhoods.

    Counterfactual explanations focus on 'actionable knowledge' to help end-users understand how a machine learning outcome could be changed to a more desirable one. For this purpose, a counterfactual explainer needs to discover input dependencies that relate to outcome changes. Identifying the minimum subset of feature changes needed to action an output change in the decision is an interesting challenge for counterfactual explainers. The DisCERN algorithm introduced in this paper is a case-based counterfactual explainer. Here, counterfactuals are formed by replacing feature values from a nearest unlike neighbour (NUN) until an actionable change is observed. We show how widely adopted feature relevance-based explainers (e.g. LIME, SHAP) can inform DisCERN to identify the minimum subset of 'actionable features'. We demonstrate our DisCERN algorithm on five datasets in a comparative study with the widely used optimisation-based counterfactual approach DiCE. Our results demonstrate that DisCERN is an effective strategy to minimise the actionable changes necessary to create good counterfactual explanations.
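
    The substitution loop described above can be sketched as follows: copy feature values from the NUN into the query, most relevant feature first, until the model's decision flips. The relevance ordering would come from an explainer such as LIME or SHAP; here it is supplied directly, and the tiny dataset and classifier are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def counterfactual_by_substitution(x, nun, model, relevance_order):
    """Replace query features with NUN values in relevance order until the class flips."""
    target = model.predict(nun.reshape(1, -1))[0]
    counterfactual = x.copy()
    changed = []
    for i in relevance_order:
        counterfactual[i] = nun[i]
        changed.append(i)
        if model.predict(counterfactual.reshape(1, -1))[0] == target:
            break  # minimal set of actionable feature changes found
    return counterfactual, changed

# Toy demonstration: the class depends only on the first feature.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X, y)

x, nun = np.array([0.0, 0.0]), np.array([1.0, 1.0])
print(counterfactual_by_substitution(x, nun, model, relevance_order=[0, 1]))
```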

    Conceptual modelling of explanation experiences through the iSeeonto ontology.

    Explainable Artificial Intelligence is a broad research field, required in the many situations where we need to understand the behaviour of Artificial Intelligence systems. However, each explanation need is unique, which makes it difficult to apply existing explanation techniques and solutions when faced with a new problem. Therefore, the task of implementing an explanation system can be very challenging, because we need to take into account the AI model, the user's needs and goals, the available data, suitable explainers, etc. In this work, we propose a formal model to define and orchestrate all the elements involved in an explanation system, and make a novel contribution by formalising this model as the iSeeOnto ontology. This ontology not only enables the conceptualisation of a wide range of explanation systems, but also supports the application of Case-Based Reasoning as a knowledge transfer approach that reuses previous explanation experiences from unrelated domains. To demonstrate the suitability of the proposed model, we present an exhaustive validation by classifying reference explanation systems found in the literature into the iSeeOnto ontology.
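
    As a loose illustration of what conceptualising an explanation experience as ontology instances might look like, the sketch below records one experience as RDF triples with rdflib. The namespace, class and property names are placeholders invented for the example and are not the actual iSeeOnto vocabulary.

```python
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/isee-demo#")  # hypothetical namespace, not iSeeOnto's
g = Graph()
g.bind("ex", EX)

# One explanation experience, described by the model, user goal and explainer involved.
g.add((EX.experience1, RDF.type, EX.ExplanationExperience))
g.add((EX.experience1, EX.usesModel, EX.CreditScoringModel))
g.add((EX.experience1, EX.userGoal, Literal("understand a loan rejection")))
g.add((EX.experience1, EX.appliesExplainer, EX.CounterfactualExplainer))

print(g.serialize(format="turtle"))
```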

    Taxonomic corpus-based concept summary generation for document annotation.

    Semantic annotation is an enabling technology which links documents to concepts that unambiguously describe their content. Annotation improves access to document contents for both humans and software agents. However, the annotation process is a challenging task, as annotators often have to select from thousands of potentially relevant concepts in controlled vocabularies. The best approaches to assist in this task rely on reusing the annotations of a pre-annotated corpus. In the absence of such a corpus, alternative approaches suffer from the insufficient descriptive text available for concepts in most vocabularies. In this paper, we propose an unsupervised method for recommending document annotations based on generating node descriptors from an external corpus. We exploit knowledge of the taxonomic structure of a thesaurus to ensure that effective descriptors (concept summaries) are generated for concepts. Our evaluation on recommending annotations shows that the content we generate effectively represents the concepts. Also, our approach outperforms those which rely on information from a thesaurus alone and is comparable with supervised approaches.
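
    The core idea, building a concept summary by pooling external-corpus text for a concept and its taxonomic neighbours and then annotating documents with the most similar summaries, can be sketched as below. The toy corpus, the two-level taxonomy and the TF-IDF weighting are illustrative stand-ins for the method evaluated in the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy external corpus keyed by concept label, and a tiny taxonomy (child -> parent).
corpus = {
    "renewable energy": "solar wind hydro power generation sustainable sources",
    "solar power": "photovoltaic panels sunlight electricity generation",
    "wind power": "turbines wind farms electricity generation",
}
parent = {"solar power": "renewable energy", "wind power": "renewable energy"}

def descriptor(concept):
    """Concept summary: the concept's own corpus text plus its parent's, when available."""
    parts = [corpus.get(concept, "")]
    if concept in parent:
        parts.append(corpus.get(parent[concept], ""))
    return " ".join(parts)

concepts = list(corpus)
documents = ["New photovoltaic panels boost electricity output from sunlight"]

vectoriser = TfidfVectorizer()
matrix = vectoriser.fit_transform([descriptor(c) for c in concepts] + documents)
similarities = cosine_similarity(matrix[len(concepts):], matrix[: len(concepts)])

# Recommend the concept whose summary is most similar to the document.
print("Recommended annotation:", concepts[similarities[0].argmax()])
```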